Abstract

We bridge the gap between functional evaluators and abstract machines
for the λ-calculus, using closure conversion, transformation into
continuation-passing style, and defunctionalization of continuations.

We illustrate this bridge by deriving Krivine's abstract machine from
an ordinary call-by-name evaluator and by deriving an ordinary
call-by-value evaluator from Felleisen et al.'s CEK machine. The first
derivation is strikingly simpler than what can be found in the literature.
The second one is new. Together, they show that Krivine's abstract machine
and the CEK machine correspond to the call-by-name and call-by-value facets
of an ordinary evaluator for the λ-calculus.

We then reveal the denotational content of Hannan and Miller's CLS
machine and of Landin's SECD machine. We formally compare the corresponding
evaluators and we illustrate some relative degrees of freedom in
the design spaces of evaluators and of abstract machines for the λ-calculus
with computational effects.

For the purpose of this work, we distinguish between virtual machines,
which have an instruction set, and abstract machines, which do not. The
Categorical Abstract Machine, for example, has an instruction set, but
Krivine's machine, the CEK machine, the CLS machine, and the SECD
machine do not; they directly operate on λ-terms instead. We present the
abstract machine that corresponds to the Categorical Abstract Machine.

1 Introduction and related work



In Hannan and Miller's words [21, Section 7], there are fundamental differences
between denotational definitions and definitions of abstract machines. While a
functional programmer tends to be familiar with denotational definitions [34],
he typically wonders about the following issues:

• How does one design an abstract machine? How were existing abstract 
machines, starting with Landin's SECD machine, designed? How does 
one make variants of an existing abstract machine? How does one extend 
an existing abstract machine to a bigger source language? How does one 
go about designing a new abstract machine? How does one relate two 
abstract machines? 

• How does one prove the correctness of an abstract machine? Assuming it 
implements a reduction strategy, should one prove that each of its transi- 
tions implements a part of this strategy? Or should one characterize it in 
reference to a given evaluator, or to another abstract machine? 

• Why do some abstract machines operate on λ-terms directly whereas
others operate on compiled λ-terms expressed with an instruction set?

A variety of answers to these questions can be found in the literature. Landin
invented the SECD machine as an implementation model for functional
languages [25]. Plotkin proved its correctness in connection with an evaluation
function [29, Section 2]. Krivine discovered an abstract machine from a
logical standpoint [24]. Crégut proved its correctness in reference to a reduction
strategy and he generalized it from weak to strong normalization [6]. Curien
discovered the Categorical Abstract Machine from a categorical standpoint [5, 7].
Felleisen et al. invented the CEK machine from an operational standpoint [16].
Hannan and Miller discovered the CLS machine from a proof-theoretical
standpoint [21]. Many people derived, invented, or re-discovered Krivine's machine.
Many others proposed modifications of existing machines. And recently, Hardin,
Maranget, and Pagano introduced a method to extract the reduction strategy
of a machine by extracting axioms from its transitions and structural rules from
its architecture [22].

In this article, we propose one simple answer to all the questions above. 
We present a correspondence between functional evaluators and abstract ma- 
chines based on a two-way derivation: closure conversion, transformation into 
continuation-passing style (CPS), and defunctionalization. This two-way deriva- 
tion lets us connect each of the machines above with an evaluator, and makes it 
possible to echo variations over the evaluator into variations over the abstract 
machine, and vice versa. The evaluator puts the reduction strategy of the ma- 
chine in the open. The abstract machine makes the evaluation steps explicit in 
a transition system. In addition, we also distinguish between abstract machines 
and virtual machines in the sense that virtual machines have an instruction set 
and abstract machines do not; instead, they directly operate on source terms 
and do not need a compiler. 
Prerequisites: We use ML as a meta-language, and we assume a basic
familiarity with Standard ML and reasoning about ML programs. In particular,
given two pure ML expressions e and e' we write e = e' to express that e and
e' are observationally equivalent. Most of our implementations of the abstract
machines raise compiler warnings about non-exhaustive matches. These are
inherent to programming abstract machines in an ML-like language. The
warnings could be avoided with an option type or with an explicit exception, at the
price of readability and direct relation to the usual mathematical specifications
of abstract machines.

It would be helpful to the reader to know at least one of the machines
considered in the rest of this article, be it Krivine's machine, the CEK machine,
the CLS machine, the SECD machine, or the Categorical Abstract Machine. It
would also be helpful to have already seen a λ-interpreter written in a functional
language [19, 30, 33, 37].

We make use of the CPS transformation [11, 31]: a term is CPS-transformed
by naming all its intermediate results, sequentializing their computation, and
introducing continuations. Plotkin was the first to establish the correctness of
the CPS transformation [29].

We also make use of Reynolds's defunctionalization [30]: defunctionalizing
a program amounts to replacing each of its function spaces by a data type
and an apply function; the data type enumerates all the function abstractions
that may give rise to inhabitants of this function space [14]. In particular,
closure conversion amounts to replacing each of the function spaces in expressible
and denotable values by a tuple, and inlining the corresponding apply
function. Nielsen, Banerjee, Heintze, and Riecke have established the correctness of
defunctionalization [2, 28].
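
As a small illustration of these two transformations, here is a sketch on a
toy function that is not part of the article. The names sum, sum_cps, sum_def,
cont, INIT, ADD, and apply_cont are ours and only serve the example: a
direct-style function is first CPS-transformed, and its continuations are then
defunctionalized into a data type with an apply function.

```sml
(* Direct style: the addition is pending while the recursive call runs. *)
fun sum nil       = 0
  | sum (x :: xs) = x + sum xs

(* CPS: the intermediate result r is named, and what remains to be
   done with it is passed explicitly as the continuation k. *)
fun sum_cps (nil, k)     = k 0
  | sum_cps (x :: xs, k) = sum_cps (xs, fn r => k (x + r))

(* Defunctionalization: the continuations of sum_cps arise from two
   function abstractions only, the initial fn r => r and
   fn r => k (x + r), whose free variables are x and k. *)
datatype cont = INIT
              | ADD of int * cont

fun sum_def (nil, k)     = apply_cont (k, 0)
  | sum_def (x :: xs, k) = sum_def (xs, ADD (x, k))
and apply_cont (INIT, r)       = r
  | apply_cont (ADD (x, k), r) = apply_cont (k, x + r)

val direct = sum [1, 2, 3]
val cps    = sum_cps ([1, 2, 3], fn r => r)
val defun  = sum_def ([1, 2, 3], INIT)
```

All three bindings evaluate to 6. The data type cont is isomorphic to a list of
integers, in the same way that a defunctionalized continuation of an evaluator
can turn out to be a stack.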

Overview: The rest of this article is organized as follows. We first consider 
a call-by-name and a call-by-value evaluator, and we present the correspond- 
ing machines, which are Krivine's machine and the CEK machine. We then 
consider the CLS machine and the SECD machine, and we present the corre- 
sponding evaluators. We finally turn to the Categorical Abstract Machine. For 
simplicity, we do not cover laziness and sharing, but they come for free by CPS 
transformation and threading of a heap of updateable thunks. 

2 Call-by-name, call-by-value, and the λ-calculus

We first go from a call-by-name evaluator to Krivine's abstract machine
(Section 2.1) and then from the CEK machine to a call-by-value evaluator
(Section 2.2). The derivation steps consist of closure conversion, transformation
into continuation-passing style, and defunctionalization of continuations.

Krivine's abstract machine operates on de Bruijn-encoded λ-terms, and the
CEK machine operates on λ-terms with names. Starting from the corresponding
evaluators, it is simple to construct a version of Krivine's abstract machine
that operates on λ-terms with names, and a version of the CEK machine that
operates on de Bruijn-encoded λ-terms (Section 2.3).

2.1 From a call-by-name evaluator to Krivine's machine

Krivine's abstract machine [6] operates on de Bruijn-encoded λ-terms. In this
representation, identifiers are represented by their lexical offset, as traditional
since Algol 60 [38].

    datatype term = IND of int   (* de Bruijn index *)
                  | ABS of term
                  | APP of term * term

Programs are closed terms. 
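
To make the encoding concrete, the following sketch builds a few sample terms
and checks closedness. The names id, k, prog, closed_at, and closed are ours
and are only assumed for the illustration; the article itself does not define
a closedness test.

```sml
datatype term = IND of int   (* de Bruijn index *)
              | ABS of term
              | APP of term * term

(* λx.x : index 0 refers to the nearest enclosing binder *)
val id = ABS (IND 0)

(* λx.λy.x : index 1 skips the binder for y *)
val k = ABS (ABS (IND 1))

(* (λx.λy.x) (λx.x) *)
val prog = APP (k, id)

(* closed_at d t holds when every index in t refers to one of the
   d binders that enclose t. *)
fun closed_at d (IND n)        = 0 <= n andalso n < d
  | closed_at d (ABS t)        = closed_at (d + 1) t
  | closed_at d (APP (t0, t1)) = closed_at d t0 andalso closed_at d t1

(* A program is a term that is closed at depth 0. *)
fun closed t = closed_at 0 t
```

Here closed prog holds, whereas closed (IND 0) does not, since the index has no
binder to refer to.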

2.1.1 A higher-order and compositional call-by-name evaluator 

Our starting point is the canonical call-by-name evaluator for the λ-calculus [33,
35]. This evaluator is compositional in the sense of denotational semantics [32,
35, 39] and higher order. It is compositional because it solely defines the
meaning of each term as a composition of the meaning of its parts. It is higher
order because the data types denval and expval contain functions. Denotable
values are thunks and expressible values are functions [36]. Environments are
represented as lists of denotable values. A program is evaluated in an empty
environment.

    structure Eval0
    = struct
        datatype denval = THUNK of unit -> expval
             and expval = FUNCT of denval -> expval

        (* eval : term * denval list -> expval *)
        fun eval (IND n, e)
            = let val (THUNK thunk) = List.nth (e, n)
              in thunk ()
              end
          | eval (ABS t, e)
            = FUNCT (fn v => eval (t, v :: e))
          | eval (APP (t0, t1), e)
            = let val (FUNCT f) = eval (t0, e)
              in f (THUNK (fn () => eval (t1, e)))
              end

        (* main : term -> expval *)
        fun main t
            = eval (t, nil)
      end

2.1.2 From higher-order functions to closures 

In Eval0, the function spaces in the data types of denotable and expressible
values are only inhabited by instances of the λ-abstractions
fn v => eval (t, v :: e) in the meaning of abstractions, and
fn () => eval (t1, e) in the meaning of applications. Each of these
λ-abstractions has two free variables: a term and an environment. We
defunctionalize these function spaces into closures [14, 25, 30], and we
inline the corresponding apply functions.

    structure Eval1
    = struct
        datatype denval = THUNK of term * denval list
             and expval = FUNCT of term * denval list

        (* eval : term * denval list -> expval *)
        fun eval (IND n, e)
            = let val (THUNK (t, e')) = List.nth (e, n)
              in eval (t, e')
              end
          | eval (ABS t, e)
            = FUNCT (t, e)
          | eval (APP (t0, t1), e)
            = let val (FUNCT (t, e')) = eval (t0, e)
              in eval (t, (THUNK (t1, e)) :: e')
              end

        (* main : term -> expval *)
        fun main t
            = eval (t, nil)
      end

The definition of an abstraction is now Eval1.FUNCT (t, e) instead of
fn v => Eval0.eval (t, v :: e), and its use is now
Eval1.eval (t, (Eval1.THUNK (t1, e)) :: e') instead of
f (Eval0.THUNK (fn () => Eval0.eval (t1, e))). Similarly, the definition of a
thunk is now Eval1.THUNK (t1, e) instead of
Eval0.THUNK (fn () => Eval0.eval (t1, e)) and its use is Eval1.eval (t, e')
instead of thunk ().

The following proposition is a corollary of the correctness of defunctional- 
ization. 

Proposition 1 (full correctness) For any ML value p : term denoting a
program, evaluating Eval0.main p yields a value FUNCT f and evaluating
Eval1.main p yields a value FUNCT (t, e) such that

    f = fn v => Eval1.eval (t, v :: e)

2.1.3 CPS transformation

We transform eval into continuation-passing style (see footnote 1). Doing so
makes it tail recursive.

Footnote 1: Since programs are closed, applying List.nth cannot fail and
therefore it denotes a total function. We thus keep it in direct style [13].

    structure Eval2
    = struct
        datatype denval = THUNK of term * denval list
             and expval = FUNCT of term * denval list

        (* eval : term * denval list * (expval -> 'a) -> 'a *)
        fun eval (IND n, e, k)
            = let val (THUNK (t, e')) = List.nth (e, n)
              in eval (t, e', k)
              end
          | eval (ABS t, e, k)
            = k (FUNCT (t, e))
          | eval (APP (t0, t1), e, k)
            = eval (t0, e, fn (FUNCT (t, e'))
                              => eval (t, (THUNK (t1, e)) :: e', k))

        (* main : term -> expval *)
        fun main t
            = eval (t, nil, fn v => v)
      end

The following proposition is a corollary of the correctness of the CPS trans- 
formation. (Here observational equivalence reduces to structural equality over 
ML values of type expval.) 

Proposition 2 (full correctness) For any ML value p : term denoting a pro- 
gram, 

    Eval1.main p = Eval2.main p

2.1.4 Defunctionalizing the continuations

The function space of the continuation is inhabited by instances of two
λ-abstractions: the initial one in the definition of Eval2.main, with no free
variables, and one in the meaning of an application, with three free variables. To
defunctionalize the continuation, we thus define a data type cont with two
summands and the corresponding apply_cont function to interpret these summands.

    structure Eval3
    = struct
        datatype denval = THUNK of term * denval list
             and expval = FUNCT of term * denval list
             and cont = CONT0
                      | CONT1 of term * denval list * cont

        (* eval : term * denval list * cont -> expval *)
        fun eval (IND n, e, k)
            = let val (THUNK (t, e')) = List.nth (e, n)
              in eval (t, e', k)
              end
          | eval (ABS t, e, k)
            = apply_cont (k, FUNCT (t, e))
          | eval (APP (t0, t1), e, k)
            = eval (t0, e, CONT1 (t1, e, k))
        and apply_cont (CONT0, v)
            = v
          | apply_cont (CONT1 (t1, e, k), FUNCT (t, e'))
            = eval (t, (THUNK (t1, e)) :: e', k)

        (* main : term -> expval *)
        fun main t
            = eval (t, nil, CONT0)
      end

The following proposition is a corollary of the correctness of defunctionaliza- 
tion. (Again, observational equivalence reduces here to structural equality over 
ML values of type expval.) 

Proposition 3 (full correctness) For any ML value p : term denoting a pro- 
gram, 

Eval2.main p = Eval3.main p 

We identify that cont is a stack of thunks, and that the transitions are those 
of Krivine's abstract machine. 

2.1.5 Krivine's abstract machine 

To obtain the canonical definition of Krivine's abstract machine, we abandon the
distinction between denotable and expressible values and we use thunks instead,
we represent the defunctionalized continuation as a list of thunks instead of a
data type, and we inline apply_cont.

    structure Eval4
    = struct
        datatype thunk = THUNK of term * thunk list

        (* eval : term * thunk list * thunk list -> term * thunk list *)
        (* the stack s holds thunks, like the environment e *)
        fun eval (IND n, e, s)
            = let val (THUNK (t, e')) = List.nth (e, n)
              in eval (t, e', s)
              end
          | eval (ABS t, e, nil)
            = (ABS t, e)
          | eval (ABS t, e, (THUNK (t', e')) :: s)
            = eval (t, (THUNK (t', e')) :: e, s)
          | eval (APP (t0, t1), e, s)
            = eval (t0, e, (THUNK (t1, e)) :: s)

        (* main : term -> term * thunk list *)
        fun main t
            = eval (t, nil, nil)
      end

The following proposition is straightforward to prove.

Proposition 4 (full correctness) For any ML value p : term denoting a pro- 
gram, 

Eval3.main p = Eval4.main p 

For comparison, the canonical definition of Krivine's abstract machine is as 
follows [6, 20, 24], where t denotes terms, v denotes expressible values, e denotes 
environments, and s denotes stacks of expressible values: 

• Source syntax:

        t ::= n | λt | t0 t1

• Expressible values (closures):

        v ::= [t, e]

• Initial transition, transition rules, and final transition:

        t                       ⇒  (t, nil, nil)
        (n, e, s)               ⇒  (t, e', s), where [t, e'] = nth(e, n)
        (λt, e, [t', e'] :: s)  ⇒  (t, [t', e'] :: e, s)
        (t0 t1, e, s)           ⇒  (t0, e, [t1, e] :: s)
        (λt, e, nil)            ⇒  [t, e]


Variables n are represented by their de Bruijn index, and the abstract machine 
operates on triples consisting of a term, an environment, and a stack of express- 
ible values. 

Each line in the canonical definition matches a clause in Eval4. We
conclude that Krivine's abstract machine can be seen as a defunctionalized,
CPS-transformed, and closure-converted version of the standard call-by-name
evaluator for the λ-calculus. This evaluator evidently implements Hardin, Maranget,
and Pagano's K strategy [22, Section 3].
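
To observe the individual transitions, one can split the machine into a
one-step function and a driver loop. The following is a sketch under our own
naming (step, run, and the sample terms id and k are not from the article); it
follows the transition rules above, with stack entries represented as thunks.

```sml
datatype term = IND of int   (* de Bruijn index *)
              | ABS of term
              | APP of term * term

datatype thunk = THUNK of term * thunk list

(* step : term * thunk list * thunk list
          -> (term * thunk list * thunk list) option
   Returns SOME of the next state, or NONE when the state is final. *)
fun step (IND n, e, s)
    = let val THUNK (t, e') = List.nth (e, n)
      in SOME (t, e', s)
      end
  | step (ABS t, e, nil)
    = NONE
  | step (ABS t, e, th :: s)
    = SOME (t, th :: e, s)
  | step (APP (t0, t1), e, s)
    = SOME (t0, e, THUNK (t1, e) :: s)

(* run iterates step and counts the transitions taken. *)
fun run (st, n)
    = case step st
       of NONE     => (st, n)
        | SOME st' => run (st', n + 1)

val id = ABS (IND 0)            (* λx.x *)
val k  = ABS (ABS (IND 1))      (* λx.λy.x *)

(* ((λx.λy.x) (λx.x)) (λx.λy.x), started with an empty environment
   and an empty stack *)
val (final, steps) = run ((APP (APP (k, id), k), nil, nil), 0)
val (t, _, _) = final
```

On this program the machine pushes two thunks, pops them into the environment
while entering the two abstractions of k, and resolves index 1 to the thunk for
id: five transitions in total, ending on the term ABS (IND 0) with an empty
stack.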

2.2 From the CEK machine to a call-by-value evaluator 

The CEK machine [15, 16] operates on λ-terms with names and distinguishes
between values and computations in their syntax (i.e., it distinguishes trivial
and serious terms, in Reynolds's words [30]).

    datatype term = VALUE of value
                  | COMP of comp
         and value = VAR of string   (* name *)
                   | LAM of string * term
         and comp = APP of term * term

Programs are closed terms. 

2.2.1 The CEK abstract machine



Our starting point reads as follows [18, Figure 2, page 239], where t denotes 
terms, w denotes values, v denotes expressible values, k denotes evaluation 
contexts, e denotes environments, and s denotes stacks of expressible values: 

• Source syntax:

        t ::= w | t0 t1
        w ::= x | λx.t

• Expressible values (closures) and evaluation contexts:

        v ::= [x, t, e]
        k ::= stop | fun(v, k) | arg(t, e, k)

• Initial transition, transition rules (two kinds), and final transition:

        t                       ⇒_eval  (t, mt, stop)
        (w, e, k)               ⇒_eval  (k, γ(w, e))
        (t0 t1, e, k)           ⇒_eval  (t0, e, arg(t1, e, k))
        (arg(t1, e, k), v)      ⇒_cont  (t1, e, fun(v, k))
        (fun([x, t, e], k), v)  ⇒_cont  (t, e[x ↦ v], k)
        (stop, v)               ⇒_cont  v

  where γ(x, e)    = e(x)
        γ(λx.t, e) = [x, t, e]

Variables x are represented by their name, and the abstract machine consists
of two mutually recursive transition functions. The first transition function
operates on triples consisting of a term, an environment, and an evaluation
context. The second operates on pairs consisting of an evaluation context and
an expressible value. Environments are extended in the fun-transition, and
consulted in γ. The empty environment is denoted by mt.

This specification is straightforward to program in ML:

    signature ENV
    = sig
        type 'a env
        val mt : 'a env
        val lookup : 'a env * string -> 'a
        val extend : string * 'a * 'a env -> 'a env
      end

Environments are represented as a structure Env : ENV containing a
representation of the empty environment mt, an operation lookup to retrieve the value
bound to a name in an environment, and an operation extend to extend an
environment with a binding.
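
The article leaves the implementation of Env unspecified. One possible
implementation, given here only as a sketch, represents an environment as an
association list in which the most recent binding shadows older ones. The
choice of association lists and the Fail exception for unbound names are our
assumptions; for closed programs the failing case is never reached.

```sml
signature ENV
= sig
    type 'a env
    val mt : 'a env
    val lookup : 'a env * string -> 'a
    val extend : string * 'a * 'a env -> 'a env
  end

(* Assumption: association lists, searched from the most recent binding. *)
structure Env :> ENV
= struct
    type 'a env = (string * 'a) list

    val mt = nil

    (* lookup raises Fail on an unbound name; this cannot happen when
       the machine runs a closed term. *)
    fun lookup (nil, x)
        = raise Fail ("unbound variable: " ^ x)
      | lookup ((y, v) :: e, x)
        = if x = y then v else lookup (e, x)

    fun extend (x, v, e)
        = (x, v) :: e
  end

val e1 = Env.extend ("x", 1, Env.mt)
val e2 = Env.extend ("y", 2, e1)
val e3 = Env.extend ("x", 3, e2)   (* shadows the first binding of x *)
```

The opaque ascription :> keeps the representation hidden, so the machines below
can only use mt, lookup, and extend, as in the specification.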






    structure Eval0
    = struct
        datatype expval = CLOSURE of string * term * expval Env.env
        datatype evaluation_context
          = STOP
          | ARG of term * expval Env.env * evaluation_context
          | FUN of expval * evaluation_context

        (* eval : term * expval Env.env * evaluation_context -> expval *)
        fun eval (VALUE v, e, k)
            = continue (k, eval_value (v, e))
          | eval (COMP (APP (t0, t1)), e, k)
            = eval (t0, e, ARG (t1, e, k))
        and eval_value (VAR x, e)
            = Env.lookup (e, x)
          | eval_value (LAM (x, t), e)
            = CLOSURE (x, t, e)
        and continue (STOP, w)
            = w
          | continue (ARG (t1, e, k), w)
            = eval (t1, e, FUN (w, k))
          | continue (FUN (CLOSURE (x, t, e), k), w)
            = eval (t, Env.extend (x, w, e), k)

        (* main : term -> expval *)
        fun main t
            = eval (t, Env.mt, STOP)
      end

2.2.2 Refunctionalizing the evaluation contexts into continuations 

We identify that the data type evaluation_context and the function continue
are a defunctionalized representation. The corresponding higher-order evaluator
reads as follows. As can be observed, it is in continuation-passing style.

    structure Eval1
    = struct
        datatype expval = CLOSURE of string * term * expval Env.env

        (* eval : term * expval Env.env * (expval -> 'a) -> 'a *)
        fun eval (VALUE v, e, k)
            = k (eval_value (v, e))
          | eval (COMP (APP (t0, t1)), e, k)
            = eval (t0,
                    e,
                    fn (CLOSURE (x, t, e'))
                       => eval (t1,
                                e,
                                fn w
                                   => eval (t,
                                            Env.extend (x, w, e'),
                                            k)))
        and eval_value (VAR x, e)
            = Env.lookup (e, x)
          | eval_value (LAM (x, t), e)
            = CLOSURE (x, t, e)

        (* main : term -> expval *)
        fun main t
            = eval (t, Env.mt, fn w => w)
      end

The following proposition is a corollary of the correctness of defunctional- 
ization. (Observational equivalence reduces here to structural equality over ML 
values of type expval.) 

Proposition 5 (full correctness) For any ML value p : term denoting a
program,

    Eval0.main p = Eval1.main p

2.2.3 Back to direct style

CPS-transforming the following direct-style evaluator yields the evaluator of 
Section 2.2.2 [9]. 

    structure Eval2
    = struct
        datatype expval = CLOSURE of string * term * expval Env.env

        (* eval : term * expval Env.env -> expval *)
        fun eval (VALUE v, e)
            = eval_value (v, e)
          | eval (COMP (APP (t0, t1)), e)
            = let val (CLOSURE (x, t, e')) = eval (t0, e)
                  val w = eval (t1, e)
              in eval (t, Env.extend (x, w, e'))
              end
        and eval_value (VAR x, e)
            = Env.lookup (e, x)
          | eval_value (LAM (x, t), e)
            = CLOSURE (x, t, e)

        (* main : term -> expval *)
        fun main t
            = eval (t, Env.mt)
      end

The following proposition is a corollary of the correctness of the direct-style 
transformation. (Again, observational equivalence reduces here to structural 
equality over ML values of type expval.) 

Proposition 6 (full correctness) For any ML value p : term denoting a
program,

    Eval1.main p = Eval2.main p

2.2.4 From closures to higher-order functions

We observe that the closures, in Eval2, are defunctionalized representations with 
an apply function inlined. The corresponding higher-order evaluator reads as 
follows. 

    structure Eval3
    = struct
        datatype expval = CLOSURE of expval -> expval

        (* eval : term * expval Env.env -> expval *)
        fun eval (VALUE v, e)
            = eval_value (v, e)
          | eval (COMP (APP (t0, t1)), e)
            = let val (CLOSURE f) = eval (t0, e)
                  val w = eval (t1, e)
              in f w
              end
        and eval_value (VAR x, e)
            = Env.lookup (e, x)
          | eval_value (LAM (x, t), e)
            = CLOSURE (fn w => eval (t, Env.extend (x, w, e)))

        (* main : term -> expval *)
        fun main t
            = eval (t, Env.mt)
      end



The following proposition is a corollary of the correctness of defunctional- 
ization. 

Proposition 7 (full correctness) For any ML value p : term denoting a
program, evaluating Eval2.main p yields a value CLOSURE (x, t, e) and evaluating
Eval3.main p yields a value CLOSURE f such that

    fn w => Eval2.eval (t, Env.extend (x, w, e)) = f



2.2.5 A higher-order and compositional call-by-value evaluator 

The result in Eval3 is a call-by-value evaluator that is compositional and
higher-order. This call-by-value evaluator is the canonical one for the λ-calculus [33,
35]. We conclude that the CEK machine can be seen as a defunctionalized,
CPS-transformed, and closure-converted version of the standard call-by-value
evaluator for λ-terms.

2.3 Variants of Krivine's machine and of the CEK machine



It is easy to construct a variant of Krivine's abstract machine for λ-terms with
names, by starting from a call-by-name evaluator for λ-terms with names.
Similarly, it is easy to construct a variant of the CEK machine for λ-terms with
de Bruijn indices, by starting from a call-by-value evaluator for λ-terms with
indices. It is equally easy to start from a call-by-value evaluator for λ-terms
with de Bruijn indices and no distinction between values and computations;
the resulting abstract machine coincides with Hankin's eager machine [20,
Section 8.1.2].
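
As an illustration of the first variant, here is a sketch of what a Krivine-style
machine for λ-terms with names might look like. It is our own transcription,
not code from the article: the constructor names VAR and LAM, the
association-list environments, the lookup helper, and the Fail exception are
all assumptions made for the example.

```sml
datatype term = VAR of string   (* name *)
              | LAM of string * term
              | APP of term * term

(* Assumption: an environment is an association list from names to thunks. *)
datatype thunk = THUNK of term * (string * thunk) list

fun lookup (x, nil)
    = raise Fail ("unbound variable: " ^ x)
  | lookup (x, (y, th) :: e)
    = if x = y then th else lookup (x, e)

(* eval : term * (string * thunk) list * thunk list
          -> term * (string * thunk) list *)
fun eval (VAR x, e, s)
    = let val THUNK (t, e') = lookup (x, e)
      in eval (t, e', s)
      end
  | eval (LAM (x, t), e, nil)
    = (LAM (x, t), e)
  | eval (LAM (x, t), e, th :: s)
    = eval (t, (x, th) :: e, s)
  | eval (APP (t0, t1), e, s)
    = eval (t0, e, THUNK (t1, e) :: s)

(* main : term -> term * (string * thunk) list *)
fun main t
    = eval (t, nil, nil)

val id = LAM ("x", VAR "x")
val k  = LAM ("x", LAM ("y", VAR "x"))

(* ((λx.λy.x) (λx.x)) (λx.λy.x) *)
val (result, _) = main (APP (APP (k, id), k))
```

Compared with the de Bruijn version, only the variable case and the binding of
a parameter change: the index lookup becomes a lookup by name, and entering an
abstraction pairs the popped thunk with the bound name. On the sample program,
result is the identity LAM ("x", VAR "x").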

Abstract machines processing λ-terms with de Bruijn indices often resolve
indices with transitions:

        (0, v :: e, s)      ⇒  v :: s
        (n + 1, v :: e, s)  ⇒  (n, e, s)

Compared to the evaluator of Section 2.1.1, the evaluator corresponding
to this machine has List.nth inlined and is not compositional:

    fun eval (IND 0, denval :: e, s)
        = ... denval ...
      | eval (IND n, denval :: e, s)
        = eval (IND (n - 1), e, s)

2.4 Conclusion

We have shown that Krivine's abstract machine and the CEK abstract machine
are counterparts of canonical evaluators for call-by-name and for call-by-value
λ-terms, respectively. The derivation of Krivine's machine is strikingly simpler
than what can be found in the literature. That the CEK machine can be derived
is, to the best of our knowledge, new. That these two machines are two sides
of the same coin is also new. We have not explored any other aspect of this
call-by-name/call-by-value duality [8].

Using substitutions instead of environments or inlining one of the standard
computational monads (state, continuations, etc. [37]) in the call-by-value
evaluator yields variants of the CEK machine that have been documented in the
literature [15, Chapter 8]. For example, inlining the state monad in a monadic
evaluator yields a state-passing evaluator. The corresponding abstract machine
has one more component to represent the state. In general, inlining monads
provides a generic recipe to construct arbitrarily many new abstract machines.
It does not seem as straightforward, however, to construct a "monadic abstract
machine" and then to inline a monad; we are currently studying the question.

On another note, one can consider an evaluator for strictness-annotated
λ-terms (represented either with names or with indices, and with or without
distinction between values and computations). One is then led to an abstract
machine that generalizes Krivine's machine and the CEK machine [12].

Finally, it is straightforward to extend Krivine's machine and the CEK
machine to bigger source languages (with literals, primitive operations, conditional
expressions, block structure, recursion, etc.), by starting from evaluators for
these bigger languages. For example, all the abstract machines in "The essence
of compiling with continuations" [18] are defunctionalized continuation-passing
evaluators, i.e., interpreters.

In the rest of this article, we illustrate further the correspondence between 
evaluators and abstract machines. 

3 The CLS abstract machine 

The CLS abstract machine is due to Hannan and Miller [21]. In the following, t
denotes terms, v denotes expressible values, c denotes lists of directives (a term
or the special tag ap), e denotes environments, l denotes stacks of environments,
and s denotes stacks of expressible values.

• Source syntax:

        t ::= n | λt | t0 t1

• Expressible values (closures):

        v ::= [t, e]

• Initial transition, transition rules, and final transition:

        t                               ⇒  (t :: nil, nil :: nil, nil)
        (λt :: c, e :: l, s)            ⇒  (c, l, [t, e] :: s)
        ((t0 t1) :: c, e :: l, s)       ⇒  (t0 :: t1 :: ap :: c, e :: e :: l, s)
        (0 :: c, (v :: e) :: l, s)      ⇒  (c, l, v :: s)
        (n + 1 :: c, (v :: e) :: l, s)  ⇒  (n :: c, e :: l, s)
        (ap :: c, l, v :: [t, e] :: s)  ⇒  (t :: c, (v :: e) :: l, s)
        (nil, nil, v :: s)              ⇒  v


Variables n are represented by their de Bruijn index, and the abstract machine 
operates on triples consisting of a list of directives, a stack of environments, and 
a stack of expressible values. 

3.1 The CLS machine 

Hannan and Miller's specification is straightforward to program in ML: 

    datatype term = IND of int   (* de Bruijn index *)
                  | ABS of term
                  | APP of term * term

Programs are closed terms. 

    structure Eval0
    = struct
        datatype directive = TERM of term
                           | AP

        datatype env = ENV of expval list
             and expval = CLOSURE of term * env

        (* run : directive list * env list * expval list -> expval *)
        fun run (nil, nil, v :: s)
            = v
          | run ((TERM (IND 0)) :: c, (ENV (v :: e)) :: l, s)
            = run (c, l, v :: s)
          | run ((TERM (IND n)) :: c, (ENV (v :: e)) :: l, s)
            = run ((TERM (IND (n - 1))) :: c, (ENV e) :: l, s)
          | run ((TERM (ABS t)) :: c, e :: l, s)
            = run (c, l, (CLOSURE (t, e)) :: s)
          | run ((TERM (APP (t0, t1))) :: c, e :: l, s)
            = run ((TERM t0) :: (TERM t1) :: AP :: c, e :: e :: l, s)
          | run (AP :: c, l, v :: (CLOSURE (t, ENV e)) :: s)
            = run ((TERM t) :: c, (ENV (v :: e)) :: l, s)

        (* main : term -> expval *)
        fun main t
            = run ((TERM t) :: nil, (ENV nil) :: nil, nil)
      end

3.2 A disentangled definition of the CLS machine 

In the definition of Section 3.1, all the possible transitions are meshed together 
in one recursive function, run. Instead, let us factor run into several mutually 
recursive functions, each of them with one induction variable. 
In this disentangled definition, 

• run_c interprets the list of control directives, i.e., it specifies which transi- 
tion to take if the list is empty, starts with a term, or starts with an apply 
directive. If the list is empty, the computation terminates. If the list starts 
with a term, run_t is called, caching the term in the first parameter. If 
the list starts with an apply directive, run_a is called. 

• run_t interprets the top term in the list of control directives. 

• run_a interprets the top value in the current stack. 

The disentangled definition reads as follows: 

    structure Eval1
    = struct
        datatype directive = TERM of term
                           | AP

        datatype env = ENV of expval list
             and expval = CLOSURE of term * env

        (* run_c : directive list * env list * expval list -> expval *)
        fun run_c (nil, nil, v :: s)
            = v
          | run_c ((TERM t) :: c, l, s)
            = run_t (t, c, l, s)
          | run_c (AP :: c, l, s)
            = run_a (c, l, s)
        and run_t (IND 0, c, (ENV (v :: e)) :: l, s)
            = run_c (c, l, v :: s)
          | run_t (IND n, c, (ENV (v :: e)) :: l, s)
            = run_t (IND (n - 1), c, (ENV e) :: l, s)
          | run_t (ABS t, c, e :: l, s)
            = run_c (c, l, (CLOSURE (t, e)) :: s)
          | run_t (APP (t0, t1), c, e :: l, s)
            = run_t (t0, (TERM t1) :: AP :: c, e :: e :: l, s)
        and run_a (c, l, v :: (CLOSURE (t, ENV e)) :: s)
            = run_t (t, c, (ENV (v :: e)) :: l, s)

        (* main : term -> expval *)
        fun main t
            = run_t (t, nil, (ENV nil) :: nil, nil)
      end

Proposition 8 (full correctness) For any ML value p : term denoting a
program,

    Eval0.main p = Eval1.main p

Proof: By fold-unfold [4]. The invariants are as follows. For any ML values
t : term, c : directive list, l : env list, and s : expval list,

    Eval1.run_c (c, l, s)    = Eval0.run (c, l, s)
    Eval1.run_t (t, c, l, s) = Eval0.run ((TERM t) :: c, l, s)
    Eval1.run_a (c, l, s)    = Eval0.run (AP :: c, l, s)

□



3.3 The evaluator corresponding to the CLS machine 

In the disentangled definition of Section 3.2, there are three possible ways to
construct a list of control directives (nil, cons'ing a term, and cons'ing an apply
directive). We could specify these constructions as a data type rather than as a
list. Such a data type, together with run_c, is in the image of defunctionalization
(run_c is the apply function of the data type). The corresponding higher-order
evaluator is in continuation-passing style. Transforming it back to direct style
yields the following evaluator:

    structure Eval3
    = struct
        datatype env = ENV of expval list
             and expval = CLOSURE of term * env

        (* run_t : term * env list * expval list -> env list * expval list *)
        fun run_t (IND 0, (ENV (v :: e)) :: l, s)
            = (l, v :: s)
          | run_t (IND n, (ENV (v :: e)) :: l, s)
            = run_t (IND (n - 1), (ENV e) :: l, s)
          | run_t (ABS t, e :: l, s)
            = (l, (CLOSURE (t, e)) :: s)
          | run_t (APP (t0, t1), e :: l, s)
            = let val (l, s) = run_t (t0, e :: e :: l, s)
                  val (l, s) = run_t (t1, l, s)
              in run_a (l, s)
              end
        and run_a (l, v :: (CLOSURE (t, ENV e)) :: s)
            = run_t (t, (ENV (v :: e)) :: l, s)

        (* main : term -> expval *)
        fun main t
            = let val (nil, v :: s) = run_t (t, (ENV nil) :: nil, nil)
              in v
              end
      end

The following proposition is a corollary of the correctness of defunctional- 
ization and of the CPS transformation. (Here observational equivalence reduces 
to structural equality over ML values of type expval.) 

Proposition 9 (full correctness) For any ML value p : term denoting a pro- 
gram, 

Evall.main p = Eval3.main p 

As in Section 2, this evaluator can be made compositional by refunctionalizing
the closures into higher-order functions and by factoring the resolution of
de Bruijn indices into an auxiliary lookup function.
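
The compositional variant described above can be sketched as follows. This is a minimal sketch under our own assumptions, not the authors' code: the structure name Eval3Comp, the function names eval, apply, and lookup, and the error cases are all hypothetical. Closures are represented as ML functions, and de Bruijn indices are resolved by the auxiliary lookup function.

```sml
datatype term = IND of int      (* de Bruijn index *)
              | ABS of term
              | APP of term * term

structure Eval3Comp   (* hypothetical name *)
= struct
    (* A closure is now a function awaiting its argument, the stack of *)
    (* environments, and the stack of intermediate results.            *)
    datatype env = ENV of expval list
    and expval = CLOSURE of (expval * env list * expval list
                             -> env list * expval list)

    (* lookup : int * 'a list -> 'a *)
    fun lookup (0, v :: _) = v
      | lookup (n, _ :: e) = lookup (n - 1, e)
      | lookup (_, nil)    = raise Subscript

    (* eval : term * env list * expval list -> env list * expval list *)
    fun eval (IND n, (ENV e) :: l, s)
        = (l, lookup (n, e) :: s)
      | eval (ABS t, (ENV e) :: l, s)
        = (l, CLOSURE (fn (v, l, s) => eval (t, (ENV (v :: e)) :: l, s)) :: s)
      | eval (APP (t0, t1), e :: l, s)
        = let val (l, s) = eval (t0, e :: e :: l, s)  (* caller-save: e is duplicated *)
              val (l, s) = eval (t1, l, s)
          in apply (l, s)
          end
      | eval (_, nil, _)
        = raise Fail "no current environment"

    and apply (l, v :: (CLOSURE f) :: s) = f (v, l, s)
      | apply _ = raise Fail "stuck"

    (* main : term -> expval *)
    fun main t
        = case eval (t, (ENV nil) :: nil, nil)
            of (nil, v :: _) => v
             | _ => raise Fail "unexpected final state"
end
```

In this sketch each clause of eval only calls eval on proper sub-terms of its argument, which is what makes the definition compositional. Since expressible values are now functions, results can no longer be compared with structural equality.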

We conclude that the evaluation model embodied in the CLS machine is 
a call-by-value interpreter threading a stack of environments and a stack of
intermediate results with a caller-save strategy (witness the duplication of en- 
vironments on the stack in the meaning of applications) and with a left-to-right 
evaluation of sub-terms. In particular, the meaning of a term is a partial endo- 
function over a stack of environments and a stack of intermediate results. 

4 The SECD abstract machine 

The SECD abstract machine is due to Landin [25]. In the following, t denotes
terms, v denotes expressible values, c denotes lists of directives (a term or the
special tag ap), e denotes environments, s denotes stacks of expressible values,
and d denotes dumps (lists of triples consisting of a stack, an environment, and
a list of directives).

• Source syntax:

\[ t ::= x \mid \lambda x.t \mid t_0\, t_1 \]

• Expressible values (closures):

\[ v ::= [x, t, e] \]



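• Initial transition, transition rules, and final transition:

\[
\begin{array}{rcl}
t & \Rightarrow & (\mathrm{nil},\ \mathrm{mt},\ t :: \mathrm{nil},\ \mathrm{nil}) \\
(s,\ e,\ x :: c,\ d) & \Rightarrow & (e(x) :: s,\ e,\ c,\ d) \\
(s,\ e,\ (\lambda x.t) :: c,\ d) & \Rightarrow & ([x, t, e] :: s,\ e,\ c,\ d) \\
(s,\ e,\ (t_0\, t_1) :: c,\ d) & \Rightarrow & (s,\ e,\ t_1 :: t_0 :: \mathrm{ap} :: c,\ d) \\
([x, t, e'] :: v :: s,\ e,\ \mathrm{ap} :: c,\ d) & \Rightarrow & (\mathrm{nil},\ e'[x \mapsto v],\ t :: \mathrm{nil},\ (s, e, c) :: d) \\
(v :: s,\ e,\ \mathrm{nil},\ (s', e', c') :: d) & \Rightarrow & (v :: s',\ e',\ c',\ d) \\
(v :: s,\ e,\ \mathrm{nil},\ \mathrm{nil}) & \Rightarrow & v
\end{array}
\]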



Variables x are represented by their name, and the abstract machine operates 
on quadruples consisting of a stack of expressible values, an environment, a list 
of directives, and a dump. Environments are consulted in the first transition 
rule, and extended in the fourth. The empty environment is denoted by mt. 
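
To illustrate the transition rules, here is a worked trace that we add for the closed term (λx.x) (λy.y). It is derived by hand from the rules above and is not taken from Landin's presentation. Note that the argument is evaluated first and that the dump saves the caller's stack, environment, and control when the closure is applied.

```latex
\[
\begin{array}{rl}
 & (\lambda x.x)\,(\lambda y.y) \\
\Rightarrow & (\mathrm{nil},\ \mathrm{mt},\ ((\lambda x.x)\,(\lambda y.y)) :: \mathrm{nil},\ \mathrm{nil}) \\
\Rightarrow & (\mathrm{nil},\ \mathrm{mt},\ (\lambda y.y) :: (\lambda x.x) :: \mathrm{ap} :: \mathrm{nil},\ \mathrm{nil}) \\
\Rightarrow & ([y, y, \mathrm{mt}] :: \mathrm{nil},\ \mathrm{mt},\ (\lambda x.x) :: \mathrm{ap} :: \mathrm{nil},\ \mathrm{nil}) \\
\Rightarrow & ([x, x, \mathrm{mt}] :: [y, y, \mathrm{mt}] :: \mathrm{nil},\ \mathrm{mt},\ \mathrm{ap} :: \mathrm{nil},\ \mathrm{nil}) \\
\Rightarrow & (\mathrm{nil},\ \mathrm{mt}[x \mapsto [y, y, \mathrm{mt}]],\ x :: \mathrm{nil},\ (\mathrm{nil}, \mathrm{mt}, \mathrm{nil}) :: \mathrm{nil}) \\
\Rightarrow & ([y, y, \mathrm{mt}] :: \mathrm{nil},\ \mathrm{mt}[x \mapsto [y, y, \mathrm{mt}]],\ \mathrm{nil},\ (\mathrm{nil}, \mathrm{mt}, \mathrm{nil}) :: \mathrm{nil}) \\
\Rightarrow & ([y, y, \mathrm{mt}] :: \mathrm{nil},\ \mathrm{mt},\ \mathrm{nil},\ \mathrm{nil}) \\
\Rightarrow & [y, y, \mathrm{mt}]
\end{array}
\]
```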

4.1 The SECD machine 

Landin's specification is straightforward to program in ML. Programs are closed 
terms. Environments are as in Section 2.2. 

datatype term = VAR of string (* name *)
              | LAM of string * term
              | APP of term * term

structure Eval0
= struct
    datatype directive = TERM of term
                       | AP

    datatype value = CLOSURE of string * term * value Env.env

    fun run (v :: nil, e', nil, nil)
        = v
      | run (s, e, (TERM (VAR x)) :: c, d)
        = run ((Env.lookup (e, x)) :: s, e, c, d)
      | run (s, e, (TERM (LAM (x, t))) :: c, d)
        = run ((CLOSURE (x, t, e)) :: s, e, c, d)
      | run (s, e, (TERM (APP (t0, t1))) :: c, d)
        = run (s, e, (TERM t1) :: (TERM t0) :: AP :: c, d)
      | run ((CLOSURE (x, t, e')) :: v :: s, e, AP :: c, d)
        = run (nil, Env.extend (x, v, e'), (TERM t) :: nil, (s, e, c) :: d)
      | run (v :: nil, e', nil, (s, e, c) :: d)
        = run (v :: s, e, c, d)

    (* main : term -> value *)
    fun main t
        = run (nil, Env.mt, (TERM t) :: nil, nil)
end



4.2 A disentangled definition of the SECD machine 

As in the CLS machine, in the definition of Section 4.1, all the possible
transitions are meshed together in one recursive function, run. Instead, we can
factor run into several mutually recursive functions, each of them with one
induction variable. These mutually recursive functions are in defunctionalized
form: the one processing the dump is an apply function for the data type
representing the dump (a list of stacks, environments, and lists of directives),
and the one processing the control is an apply function for the data type
representing the control (a list of directives). The corresponding higher-order
evaluator is in continuation-passing style with two nested continuations and one
control delimiter [11, 17]. The delimiter resets the control continuation when
evaluating the body of a λ-abstraction. (More detail is available in a technical
report [10].)
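
The factoring described above can be sketched as follows. This is our own minimal reconstruction, not the definition given in the technical report: the structure name SecdDisentangled and the function names run_c, run_d, run_t, and run_a (chosen by analogy with Section 3.2) are hypothetical, and Env is a hypothetical association-list stand-in for the environment structure of Section 2.2.

```sml
datatype term = VAR of string
              | LAM of string * term
              | APP of term * term

(* Hypothetical stand-in for the Env structure of Section 2.2. *)
structure Env
= struct
    type 'a env = (string * 'a) list
    exception Unbound of string
    val mt = nil
    fun extend (x, v, e) = (x, v) :: e
    fun lookup (x, nil) = raise Unbound x
      | lookup (x, (y, v) :: e) = if x = y then v else lookup (x, e)
end

structure SecdDisentangled   (* hypothetical name *)
= struct
    datatype directive = TERM of term
                       | AP

    datatype value = CLOSURE of string * term * value Env.env

    (* run_c dispatches on the control: the apply function for the control *)
    fun run_c (s, e, nil, d)
        = run_d (s, d)
      | run_c (s, e, (TERM t) :: c, d)
        = run_t (t, s, e, c, d)
      | run_c (s, e, AP :: c, d)
        = run_a (s, e, c, d)

    (* run_d dispatches on the dump: the apply function for the dump *)
    and run_d (v :: nil, nil)
        = v
      | run_d (v :: nil, (s, e, c) :: d)
        = run_c (v :: s, e, c, d)

    (* run_t dispatches on the term *)
    and run_t (VAR x, s, e, c, d)
        = run_c ((Env.lookup (x, e)) :: s, e, c, d)
      | run_t (LAM (x, t), s, e, c, d)
        = run_c ((CLOSURE (x, t, e)) :: s, e, c, d)
      | run_t (APP (t0, t1), s, e, c, d)
        = run_t (t1, s, e, (TERM t0) :: AP :: c, d)

    (* run_a dispatches on the stack of intermediate results *)
    and run_a ((CLOSURE (x, t, e')) :: v :: s, e, c, d)
        = run_t (t, nil, Env.extend (x, v, e'), nil, (s, e, c) :: d)

    (* main : term -> value *)
    fun main t
        = run_t (t, nil, Env.mt, nil, nil)
end
```

With these definitions, applying SecdDisentangled.main to APP (LAM ("x", VAR "x"), LAM ("y", VAR "y")) yields CLOSURE ("y", VAR "y", nil), which agrees with the result of running the machine of Section 4.1 by hand on the same term.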

4.3 The evaluator corresponding to the SECD machine 

The direct-style version of the evaluator from Section 4.2 reads as follows: 

structure Eval4
= struct
    datatype value = CLOSURE of string * term * value Env.env

    (* eval : term * value list * value Env.env *)
    (*        -> value list * value Env.env     *)
    fun eval (VAR x, s, e)
        = ((Env.lookup (x, e)) :: s, e)
      | eval (LAM (x, t), s, e)
        = ((CLOSURE (x, t, e)) :: s, e)
      | eval (APP (t0, t1), s, e)
        = let val (s, e) = eval (t1, s, e)
              val (s, e) = eval (t0, s, e)
          in apply (s, e)
          end

    and apply ((CLOSURE (x, t, e')) :: v :: s, e)
        = let val (v :: nil, _)
                  = reset (fn ()
                           => eval (t, nil, Env.extend (x, v, e')))
          in (v :: s, e)
          end

    (* main : term -> value *)
    fun main t
        = let val (v :: nil, _)
                  = reset (fn ()
                           => eval (t, nil, Env.mt))
          in v
          end
end

The following proposition is a corollary of the correctness of defunctional- 
ization and of the CPS transformation. (Here observational equivalence reduces 
to structural equality over ML values of type value.) 

Proposition 10 (full correctness) For any ML value p : term denoting a 
program, 

Eval0.main p = Eval4.main p

As in Sections 2 and 3, this evaluator can be made compositional by
refunctionalizing the closures into higher-order functions.

We conclude that the evaluation model embodied in the SECD machine
is a call-by-value interpreter threading a stack of intermediate results and an
environment with a callee-save strategy (witness the dynamic passage of
environments in the meaning of applications), a right-to-left evaluation of
sub-terms, and a control delimiter. In particular, the meaning of a term is a
partial endofunction over a stack of intermediate results and an environment.
Furthermore, this evaluator evidently implements Hardin, Maranget, and Pagano's
L strategy, i.e., right-to-left call by value, without us having to "guess" its
inference rules [22, Section 4].

The denotational content of the SECD machine puts a new light on it. For 
example, its separation between a control register and a dump register is ex- 
plained by the control delimiter in the evaluator. Removing this control de- 
limiter gives rise to an abstract machine with a single stack component for 
control — not by a clever change in the machine itself, but by a straightforward 
simplification in the corresponding evaluator. 
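
The simplification can be sketched as follows. This is a minimal sketch under our own assumptions: the structure name Eval4NoReset is hypothetical, and Env is a hypothetical association-list stand-in for the environment structure of Section 2.2. The only change with respect to the evaluator of Section 4.3 is that the two occurrences of reset are dropped. Since the source language has no control operators, the value computed for a program is unchanged; what changes is the abstract machine obtained by CPS transformation and defunctionalization.

```sml
datatype term = VAR of string
              | LAM of string * term
              | APP of term * term

(* Hypothetical stand-in for the Env structure of Section 2.2. *)
structure Env
= struct
    type 'a env = (string * 'a) list
    exception Unbound of string
    val mt = nil
    fun extend (x, v, e) = (x, v) :: e
    fun lookup (x, nil) = raise Unbound x
      | lookup (x, (y, v) :: e) = if x = y then v else lookup (x, e)
end

structure Eval4NoReset   (* hypothetical name *)
= struct
    datatype value = CLOSURE of string * term * value Env.env

    (* eval : term * value list * value Env.env *)
    (*        -> value list * value Env.env     *)
    fun eval (VAR x, s, e)
        = ((Env.lookup (x, e)) :: s, e)
      | eval (LAM (x, t), s, e)
        = ((CLOSURE (x, t, e)) :: s, e)
      | eval (APP (t0, t1), s, e)
        = let val (s, e) = eval (t1, s, e)   (* right-to-left: argument first *)
              val (s, e) = eval (t0, s, e)
          in apply (s, e)
          end

    (* The body is evaluated on an empty stack, without a delimiter. *)
    and apply ((CLOSURE (x, t, e')) :: v :: s, e)
        = let val (v :: nil, _) = eval (t, nil, Env.extend (x, v, e'))
          in (v :: s, e)
          end
      | apply _
        = raise Fail "stuck"

    (* main : term -> value *)
    fun main t
        = let val (v :: nil, _) = eval (t, nil, Env.mt)
          in v
          end
end
```

For instance, Eval4NoReset.main (APP (LAM ("x", VAR "x"), LAM ("y", VAR "y"))) evaluates to CLOSURE ("y", VAR "y", nil).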

5 Variants of the CLS machine and of the SECD 
machine 

It is straightforward to construct a variant of the CLS machine for λ-terms with
names, by starting from an evaluator for λ-terms with names. Similarly, it is
straightforward to construct a variant of the SECD machine for λ-terms with
de Bruijn indices, by starting from an evaluator for λ-terms with indices. In the
same vein, it is simple to construct call-by-name versions of the CLS machine
and of the SECD machine, by starting from call-by-name evaluators. It is also
simple to construct a properly tail recursive version of the SECD machine, and
to extend the CLS machine and the SECD machine to bigger source languages,
by extending the corresponding evaluator.

6 The Categorical Abstract Machine



What is the difference between an abstract machine and a virtual machine? In
a companion article [1], we propose to distinguish them based on the notion of
instruction set: A virtual machine has an instruction set whereas an abstract
machine does not. An abstract machine directly operates on a λ-term, but a
virtual machine operates on a compiled representation of a λ-term, expressed
using an instruction set.

The Categorical Abstract Machine [5], for example, has an instruction set,
namely categorical combinators, and therefore (despite its name) it is a virtual
machine, not an abstract machine. In contrast, Krivine's machine, the CEK
machine, the CLS machine, and the SECD machine are all abstract machines, not
virtual machines, since they directly operate on λ-terms. In this section, we
present the abstract machine corresponding to the Categorical Abstract
Machine (CAM). We start from the evaluation model embodied in the CAM, as
obtained in the companion article.

6.1 The evaluator corresponding to the CAM 

The evaluation model embodied in the CAM is an interpreter threading a stack 
with its top element cached in a register, representing environments as express- 
ible values (namely nested pairs linked as lists), with a caller-save strategy 
(witness the duplication of the register on the stack in the meaning of appli- 
cations below), and with a left-to-right evaluation of sub-terms. In particular, 
the meaning of a term is a partial endofunction over the register and the stack. 
This evaluator reads as follows: 

datatype term = IND of int (* de Bruijn index *)
              | ABS of term
              | APP of term * term
              | NIL
              | CONS of term * term
              | CAR of term
              | CDR of term

Programs are closed terms. 

structure Eval0
= struct
    datatype expval = NULL
                    | PAIR of expval * expval
                    | CLOSURE of expval * (expval * expval list ->
                                           expval * expval list)

    (* access : int * expval * 'a -> expval * 'a *)
    fun access (0, PAIR (v1, v2), s)
        = (v2, s)
      | access (n, PAIR (v1, v2), s)
        = access (n - 1, v1, s)

    (* eval : term * expval * expval list -> expval * expval list *)
    fun eval (IND n, v, s)
        = access (n, v, s)
      | eval (ABS t, v, s)
        = (CLOSURE (v, fn (v, s) => eval (t, v, s)), s)
      | eval (APP (t0, t1), v, s)
        = let val (v, v' :: s) = eval (t0, v, v :: s)
              val (v', (CLOSURE (v, f)) :: s) = eval (t1, v', v :: s)
          in f (PAIR (v, v'), s)
          end
      | eval (NIL, v, s)
        = (NULL, s)
      | eval (CONS (t1, t2), v, s)
        = let val (v, v' :: s) = eval (t1, v, v :: s)
              val (v, v' :: s) = eval (t2, v', v :: s)
          in (PAIR (v', v), s)
          end
      | eval (CAR t, v, s)
        = let val (PAIR (v1, v2), s) = eval (t, v, s)
          in (v1, s)
          end
      | eval (CDR t, v, s)
        = let val (PAIR (v1, v2), s) = eval (t, v, s)
          in (v2, s)
          end

    (* main : term -> expval *)
    fun main t
        = let val (v, nil) = eval (t, NULL, nil)
          in v
          end
end

This evaluator evidently implements Hardin, Maranget, and Pagano's X strat- 
egy [22, Section 6]. 



6.2 The abstract machine corresponding to the CAM 

As in Sections 2, 3, and 4, we can closure-convert the evaluator of Section 6.1 by
defunctionalizing its expressible values, transform it into continuation-passing
style, and defunctionalize its continuations. The resulting abstract machine
reads as follows, where t denotes terms, v denotes expressible values, k denotes
evaluation contexts, and s denotes stacks of expressible values.

• Source syntax:

\[ t ::= n \mid \lambda t \mid t_0\, t_1 \mid \mathrm{nil} \mid (\mathrm{cons}\ t_1\ t_2) \mid (\mathrm{car}\ t) \mid (\mathrm{cdr}\ t) \]

• Expressible values (unit value, pairs, and closures) and evaluation contexts:

\[
\begin{array}{rcl}
v & ::= & \mathrm{null} \mid (v_1, v_2) \mid [v, t] \\
k & ::= & \mathrm{CONT0} \mid \mathrm{CONT1}(t, k) \mid \mathrm{CONT2}(k) \mid \mathrm{CONT3}(t, k) \mid
          \mathrm{CONT4}(k) \mid \mathrm{CONT5}(k) \mid \mathrm{CONT6}(k)
\end{array}
\]

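• Initial transition, transition rules (two kinds), and final transition:

\[
\begin{array}{rcl}
t & \Rightarrow_{\mathit{eval}} & (t,\ \mathrm{null},\ \mathrm{nil},\ \mathrm{CONT0}) \\[1ex]
(n,\ v,\ s,\ k) & \Rightarrow_{\mathit{eval}} & (k,\ \gamma(n, v),\ s) \\
(\lambda t,\ v,\ s,\ k) & \Rightarrow_{\mathit{eval}} & (k,\ [v, t],\ s) \\
(\mathrm{nil},\ v,\ s,\ k) & \Rightarrow_{\mathit{eval}} & (k,\ \mathrm{null},\ s) \\
(t_0\, t_1,\ v,\ s,\ k) & \Rightarrow_{\mathit{eval}} & (t_0,\ v,\ v :: s,\ \mathrm{CONT1}(t_1, k)) \\
((\mathrm{cons}\ t_1\ t_2),\ v,\ s,\ k) & \Rightarrow_{\mathit{eval}} & (t_1,\ v,\ v :: s,\ \mathrm{CONT3}(t_2, k)) \\
((\mathrm{car}\ t),\ v,\ s,\ k) & \Rightarrow_{\mathit{eval}} & (t,\ v,\ s,\ \mathrm{CONT5}(k)) \\
((\mathrm{cdr}\ t),\ v,\ s,\ k) & \Rightarrow_{\mathit{eval}} & (t,\ v,\ s,\ \mathrm{CONT6}(k)) \\[1ex]
(\mathrm{CONT1}(t, k),\ v,\ v' :: s) & \Rightarrow_{\mathit{cont}} & (t,\ v',\ v :: s,\ \mathrm{CONT2}(k)) \\
(\mathrm{CONT2}(k),\ v',\ [v, t] :: s) & \Rightarrow_{\mathit{cont}} & (t,\ (v, v'),\ s,\ k) \\
(\mathrm{CONT3}(t_1, k),\ v,\ v' :: s) & \Rightarrow_{\mathit{cont}} & (t_1,\ v',\ v :: s,\ \mathrm{CONT4}(k)) \\
(\mathrm{CONT4}(k),\ v,\ v' :: s) & \Rightarrow_{\mathit{cont}} & (k,\ (v', v),\ s) \\
(\mathrm{CONT5}(k),\ (v_1, v_2),\ s) & \Rightarrow_{\mathit{cont}} & (k,\ v_1,\ s) \\
(\mathrm{CONT6}(k),\ (v_1, v_2),\ s) & \Rightarrow_{\mathit{cont}} & (k,\ v_2,\ s) \\[1ex]
(\mathrm{CONT0},\ v,\ \mathrm{nil}) & \Rightarrow_{\mathit{cont}} & v
\end{array}
\]

where

\[
\begin{array}{rcl}
\gamma(0, (v_1, v_2)) & = & v_2 \\
\gamma(n, (v_1, v_2)) & = & \gamma(n - 1, v_1)
\end{array}
\]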

Variables n are represented by their de Bruijn index, and the abstract machine 
consists of two mutually recursive transition functions. The first transition 
function operates on quadruples consisting of a term, an expressible value, a 
stack of expressible values, and an evaluation context. The second transition 
function operates on triples consisting of an evaluation context, an expressible 
value, and a stack of expressible values. 
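
The two transition functions can be transcribed almost literally into ML. The following is a minimal sketch of such a transcription, not code from the companion article: the names continue (for the second transition function) and Stuck are our own, access plays the role of γ, and the catch-all clauses are added only so that ill-formed states raise an exception.

```sml
datatype term = IND of int (* de Bruijn index *)
              | ABS of term
              | APP of term * term
              | NIL
              | CONS of term * term
              | CAR of term
              | CDR of term

datatype expval = NULL
                | PAIR of expval * expval
                | CLOSURE of expval * term

datatype cont = CONT0
              | CONT1 of term * cont
              | CONT2 of cont
              | CONT3 of term * cont
              | CONT4 of cont
              | CONT5 of cont
              | CONT6 of cont

exception Stuck   (* hypothetical: raised on states with no transition *)

(* access : int * expval -> expval, the function written gamma above *)
fun access (0, PAIR (_, v2)) = v2
  | access (n, PAIR (v1, _)) = access (n - 1, v1)
  | access _ = raise Stuck

(* eval : term * expval * expval list * cont -> expval *)
fun eval (IND n, v, s, k)          = continue (k, access (n, v), s)
  | eval (ABS t, v, s, k)          = continue (k, CLOSURE (v, t), s)
  | eval (NIL, v, s, k)            = continue (k, NULL, s)
  | eval (APP (t0, t1), v, s, k)   = eval (t0, v, v :: s, CONT1 (t1, k))
  | eval (CONS (t1, t2), v, s, k)  = eval (t1, v, v :: s, CONT3 (t2, k))
  | eval (CAR t, v, s, k)          = eval (t, v, s, CONT5 k)
  | eval (CDR t, v, s, k)          = eval (t, v, s, CONT6 k)

(* continue : cont * expval * expval list -> expval *)
and continue (CONT1 (t, k), v, v' :: s)           = eval (t, v', v :: s, CONT2 k)
  | continue (CONT2 k, v', (CLOSURE (v, t)) :: s) = eval (t, PAIR (v, v'), s, k)
  | continue (CONT3 (t1, k), v, v' :: s)          = eval (t1, v', v :: s, CONT4 k)
  | continue (CONT4 k, v, v' :: s)                = continue (k, PAIR (v', v), s)
  | continue (CONT5 k, PAIR (v1, _), s)           = continue (k, v1, s)
  | continue (CONT6 k, PAIR (_, v2), s)           = continue (k, v2, s)
  | continue (CONT0, v, nil)                      = v
  | continue _                                    = raise Stuck

(* main : term -> expval *)
fun main t = eval (t, NULL, nil, CONT0)
```

For example, main (APP (ABS (IND 0), NIL)) evaluates to NULL, and main (CDR (CONS (NIL, ABS (IND 0)))) evaluates to CLOSURE (NULL, IND 0).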

This abstract machine embodies the evaluation model of the CAM. Naturally,
more intuitive names could be chosen instead of CONT0, CONT1, etc.

7 Conclusion and issues



We have presented a constructive, mechanical, and generic correspondence
between functional evaluators and abstract machines. This correspondence builds
on off-the-shelf program transformations: closure conversion, CPS
transformation, defunctionalization, and inlining. We have shown how to
reconstruct known machines (Krivine's machine, the CEK machine, the CLS machine,
and the SECD machine) and how to construct new ones. Conversely, we have
revealed the denotational content of known abstract machines. We have shown
that Krivine's abstract machine and the CEK machine correspond to canonical
evaluators for the λ-calculus. We have also shown that they are dual to
each other since they correspond to call-by-name and call-by-value evaluators
in the same direct style. In terms of denotational semantics [26, 32], Krivine's
machine and the CEK machine correspond to a standard semantics, whereas
the CLS machine and the SECD machine correspond to a stack semantics of
the λ-calculus. Finally, we have exhibited the abstract machine corresponding
to the CAM, which puts the reader in a new position to answer the recurrent
question as to whether the CLS machine is closer to the CAM or to the SECD
machine.

It seems to us that this correspondence between functional evaluators and
abstract machines builds a reliable bridge between denotational definitions and
definitions of abstract machines. On the one hand, it allows one to identify the
denotational content of an abstract machine in the form of a functional
interpreter. On the other hand, it gives one a precise and generic recipe to
construct arbitrarily many new variants of abstract machines (e.g., with
substitutions or environments, or with stacks) or arbitrarily many new abstract
machines, starting from an evaluator with any given computational monad [27].

Acknowledgments: We are grateful to Malgorzata Biernacka and Henning
Korsholm Rohde for timely comments.